Search for: All records
Total Resources: 3
Author / Contributor
- Li, Yichuan (3)
- Candan, K. Selçuk (1)
- Etesami, S Rasoul (1)
- Gao, Yifan (1)
- Guo, Ruocheng (1)
- Jiang, Haoming (1)
- Jiang, Meng (1)
- Lee, Kyumin (1)
- Li, Jundong (1)
- Li, Shiyang (1)
- Li, Zheng (1)
- Liu, Huan (1)
- Liu, Xin (1)
- Luo, Zhuang (1)
- Raglin, Adrienne (1)
- Tan, Zhaoxuan (1)
- Tang, Xianfeng (1)
- Wang, Haodong (1)
- Xu, Zexing (1)
- Yin, Bing (1)
-
Free, publicly-accessible full text available April 1, 2026
-
Zhang, Zhihan; Li, Shiyang; Zhang, Zixuan; Liu, Xin; Jiang, Haoming; Tang, Xianfeng; Gao, Yifan; Li, Zheng; Wang, Haodong; Tan, Zhaoxuan; et al. (Association for Computational Linguistics). Chiruzzo, Luis; Ritter, Alan; Wang, Lu (Eds.)
The instruction hierarchy, which establishes a priority order from system messages to user messages, conversation history, and tool outputs, is essential for ensuring consistent and safe behavior in language models (LMs). Despite its importance, this topic receives limited attention, and there is a lack of comprehensive benchmarks for evaluating models’ ability to follow the instruction hierarchy. We bridge this gap by introducing IHEval, a novel benchmark comprising 3,538 examples across nine tasks, covering cases where instructions in different priorities either align or conflict. Our evaluation of popular LMs highlights their struggle to recognize instruction priorities. All evaluated models experience a sharp performance decline when facing conflicting instructions, compared to their original instruction-following performance. Moreover, the most competitive open-source model only achieves 48% accuracy in resolving such conflicts. Our results underscore the need for targeted optimization in the future development of LMs.
Free, publicly-accessible full text available April 27, 2026
-
Guo, Ruocheng; Li, Jundong; Li, Yichuan; Candan, K. Selçuk; Raglin, Adrienne; Liu, Huan (Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence)
Networked observational data presents new opportunities for learning individual causal effects, which plays an indispensable role in decision making. Such data poses the challenge of confounding bias. Previous work presents two desiderata to handle confounding bias. On the treatment group level, we aim to balance the distributions of confounder representations. On the individual level, it is desirable to capture patterns of hidden confounders that predict treatment assignments. Existing methods show the potential of utilizing network information to handle confounding bias, but they only try to satisfy one of the two desiderata. This is because the two desiderata seem to contradict each other. When the two distributions of confounder representations are highly overlapped, then we confront the undiscriminating problem between the treated and the controlled. In this work, we formulate the two desiderata as a minimax game. We propose IGNITE that learns representations of confounders from networked observational data, which is trained by a minimax game to achieve the two desiderata. Experiments verify the efficacy of IGNITE on two datasets under various settings.
